Theory of the GMM Kernel
Authors
Abstract
We develop some theoretical results for a robust similarity measure named generalized min-max (GMM). This similarity has direct applications in machine learning as a positive definite kernel and can be efficiently computed via probabilistic hashing. Owing to the discrete nature, the hashed values can also be used for efficient near neighbor search. We prove the theoretical limit of GMM and the consistency result, assuming that the data follow an elliptical distribution, which is a very general family of distributions and includes the multivariate t-distribution as a special case. The consistency result holds as long as the data have bounded first moment (an assumption which essentially holds for datasets commonly encountered in practice). Furthermore, we establish the asymptotic normality of GMM. Compared to the cosine similarity which is routinely adopted in current practice in statistics and machine learning, the consistency of GMM requires much weaker conditions. Interestingly, when the data follow the t-distribution with ν degrees of freedom, GMM typically provides a better measure of similarity than cosine roughly when ν < 8 (ν = 8 means the distribution is already very close to a normal). These theoretical results will help explain the recent success of the use of the GMM kernel [11, 12, 13] in machine learning tasks.
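For readers unfamiliar with the kernel, the following sketches the standard definition from the GMM kernel literature [11, 12, 13]; the notation $\tilde{u}, \tilde{v}$ is ours, and conventions may differ slightly from the paper. For $u, v \in \mathbb{R}^D$, first re-encode each vector nonnegatively in $\mathbb{R}^{2D}$:

$$\tilde{u}_{2j-1} = \max(u_j, 0), \qquad \tilde{u}_{2j} = \max(-u_j, 0), \qquad j = 1, \dots, D,$$

and analogously for $\tilde{v}$. The GMM similarity is then the min-max ratio

$$\mathrm{GMM}(u, v) = \frac{\sum_{i=1}^{2D} \min(\tilde{u}_i, \tilde{v}_i)}{\sum_{i=1}^{2D} \max(\tilde{u}_i, \tilde{v}_i)},$$

which reduces to the classical min-max (weighted Jaccard) similarity when the data are already nonnegative.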
Related papers
Kernel GMM and its application to image binarization
Gaussian Mixture Model (GMM) is an efficient method for parametric clustering. However, traditional GMM cannot perform clustering well on data sets with complex structure, such as images. In this paper, the kernel trick, successfully used by SVM and kernel PCA, is introduced into the EM algorithm for solving parameter estimation of GMM, the so-called kernel GMM (kGMM). The basic idea of kernel GMM is...
Kernel Trick Embedded Gaussian Mixture Model
In this paper, we present a kernel trick embedded Gaussian Mixture Model (GMM), called kernel GMM. The basic idea is to embed the kernel trick into the EM algorithm and derive a parameter estimation algorithm for GMM in feature space. Kernel GMM can be viewed as a Bayesian kernel method. Compared with most classical kernel methods, the proposed method can solve problems in a probabilistic framework. Mo...
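The two excerpts above derive EM for a Gaussian mixture directly in a kernel-induced feature space. As a rough illustration only, here is a minimal sketch that assumes an explicit kernel-PCA feature map followed by standard EM; it is a stand-in for the idea, not the authors' in-feature-space derivation.

# Minimal sketch: explicit kernel feature map (kernel PCA) + standard EM for a
# Gaussian mixture. This approximates the "kernel GMM" idea; it is not the
# in-feature-space EM algorithm derived in the papers above.
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy data: two Gaussian blobs in 2-D.
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 2)),
               rng.normal(4.0, 1.0, size=(100, 2))])

# Map into an (approximate) RBF feature space, then cluster with EM.
features = KernelPCA(n_components=10, kernel="rbf", gamma=0.5).fit_transform(X)
labels = GaussianMixture(n_components=2, random_state=0).fit_predict(features)
print(np.bincount(labels))   # cluster sizes, roughly 100 and 100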
Generalized Intersection Kernel
Following the very recent line of work on the “generalized min-max” (GMM) kernel [7], this study proposes the “generalized intersection” (GInt) kernel and the related “normalized generalized min-max” (NGMM) kernel. In computer vision, the (histogram) intersection kernel has been popular, and the GInt kernel generalizes it to data which can have both negative and positive entries. Through an ext...
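Hedging on details not shown in this excerpt, the GInt similarity presumably uses the same positive/negative re-encoding $\tilde{u}, \tilde{v} \in \mathbb{R}^{2D}$ as in the GMM definition sketched above, but without the normalizing denominator:

$$\mathrm{GInt}(u, v) = \sum_{i=1}^{2D} \min(\tilde{u}_i, \tilde{v}_i),$$

which recovers the ordinary histogram intersection kernel when $u$ and $v$ are nonnegative.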
Tunable GMM Kernels
The recently proposed “generalized min-max” (GMM) kernel [9] can be efficiently linearized, with direct applications in large-scale statistical learning and fast near neighbor search. The linearized GMM kernel was extensively compared in [9] with linearized radial basis function (RBF) kernel. On a large number of classification tasks, the tuning-free GMM kernel performs (surprisingly) well comp...
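As a hedged illustration of what "linearized" means in this line of work: one draws consistent weighted samples of the nonnegatively re-encoded data, so that the probability two vectors produce the same hash approximates their GMM similarity; the hashes can then be one-hot encoded as features for a linear model. The sketch below follows an Ioffe-style CWS recipe with a "0-bit" simplification; the sample count, seeding scheme, and output encoding are illustrative assumptions, not the precise GCWS construction of the cited papers.

# Sketch of consistent weighted sampling (CWS) on the split data; hash
# collisions estimate the GMM similarity.
import numpy as np

def split_pos_neg(x):
    # Nonnegative re-encoding in R^{2D}: positive part followed by negative part.
    return np.concatenate([np.maximum(x, 0.0), np.maximum(-x, 0.0)])

def gcws_hashes(x, k, seed=0):
    s = split_pos_neg(np.asarray(x, dtype=float))
    idx = np.flatnonzero(s)                    # coordinates with positive weight
    hashes = np.empty(k, dtype=np.int64)
    for j in range(k):
        rng = np.random.default_rng(seed + j)  # shared randomness across inputs
        r = rng.gamma(2.0, 1.0, size=s.shape[0])
        c = rng.gamma(2.0, 1.0, size=s.shape[0])
        b = rng.uniform(0.0, 1.0, size=s.shape[0])
        t = np.floor(np.log(s[idx]) / r[idx] + b[idx])
        y = np.exp(r[idx] * (t - b[idx]))
        a = c[idx] / (y * np.exp(r[idx]))
        hashes[j] = idx[np.argmin(a)]          # "0-bit" variant: keep only the index
    return hashes

u, v = np.random.default_rng(1).normal(size=(2, 50))
hu, hv = gcws_hashes(u, 256), gcws_hashes(v, 256)
print((hu == hv).mean())   # empirical collision rate, roughly the GMM similarity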
Nystrom Method for Approximating the GMM Kernel
The GMM (generalized min-max) kernel was recently proposed [5] as a measure of data similarity and was demonstrated to be effective in machine learning tasks. In order to use the GMM kernel for large-scale datasets, prior work resorted to (generalized) consistent weighted sampling (GCWS) to convert the GMM kernel to a linear kernel. We call this approach "GMM-GCWS". In the machine learning l...
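For context, a minimal sketch of the Nystrom idea applied to a GMM kernel matrix: pick m landmark points, compute the kernel blocks against them, and form rank-m explicit features Z with Z Z^T approximating K. The landmark selection and sizes below are illustrative, not the configuration studied in the cited paper.

# Nystrom approximation of the GMM kernel matrix using m random landmarks.
import numpy as np

def gmm_kernel(U, V):
    # Pairwise GMM similarity between the rows of U and the rows of V.
    def split(X):
        return np.hstack([np.maximum(X, 0.0), np.maximum(-X, 0.0)])
    SU, SV = split(U), split(V)
    K = np.empty((SU.shape[0], SV.shape[0]))
    for i in range(SU.shape[0]):
        K[i] = np.minimum(SU[i], SV).sum(axis=1) / np.maximum(SU[i], SV).sum(axis=1)
    return K

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
landmarks = X[rng.choice(500, size=50, replace=False)]   # m = 50 landmarks

C = gmm_kernel(X, landmarks)            # n x m block
W = gmm_kernel(landmarks, landmarks)    # m x m block
evals, evecs = np.linalg.eigh(W)
W_inv_sqrt = evecs @ np.diag(1.0 / np.sqrt(np.maximum(evals, 1e-12))) @ evecs.T
Z = C @ W_inv_sqrt                      # rank-m features: Z @ Z.T approximates K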
Kernel Weighted GMM Estimators for Linear Time Series Models
This paper analyzes the higher order asymptotic properties of Generalized Method of Moments (GMM) estimators for linear time series models using many lags as instruments. A data dependent moment selection method based on minimizing the approximate mean squared error is developed. In addition, a new version of the GMM estimator based on kernel weighted moment conditions is proposed. It is shown ...
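Note that "GMM" in this last entry is the econometric generalized method of moments rather than the min-max kernel above. As a hedged orientation only, here is a textbook one-step linear GMM estimator with lagged instruments; the data-generating process, lag count, and weight matrix are illustrative, and this is not the kernel-weighted estimator proposed in the paper.

# One-step linear GMM with lagged instruments: beta_hat minimizes the weighted
# quadratic form in the sample moments Z'(y - X beta)/n.
import numpy as np

rng = np.random.default_rng(0)
n, L = 500, 4                                   # sample size, number of instrument lags
z = rng.normal(size=n + L)                      # scalar instrument process
x = z[L - 1:n + L - 1] + rng.normal(scale=0.5, size=n)   # regressor driven by z lagged once
y = 1.5 * x + rng.normal(scale=0.5, size=n)     # outcome with true coefficient 1.5

Z = np.column_stack([z[L - l:n + L - l] for l in range(1, L + 1)])   # lags 1..L as instruments
X = x.reshape(-1, 1)

W = np.linalg.inv(Z.T @ Z / n)                  # 2SLS-style first-step weight matrix
A = X.T @ Z
beta_hat = np.linalg.solve(A @ W @ A.T, A @ W @ (Z.T @ y))
print(beta_hat)                                 # approximately 1.5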